BUG: resolve divide by 0 error when uploading empty dataframe #252

Merged
merged 11 commits into from
Feb 26, 2019

Conversation

bwanglzu
Contributor

For issue #237.

A quick fix to resolve ZeroDivisionError while uploading an empty dataframe.
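
For readers without the diff at hand, here is a minimal sketch of how this kind of error can arise and be guarded against. It assumes the division happens while computing an upload progress percentage from the row count; the function name `percent_complete` and its parameters are hypothetical and are not taken from the pandas-gbq source.

```python
def percent_complete(total_rows, remaining_rows):
    """Return upload progress as a percentage, guarding the empty case.

    Hypothetical helper for illustration; not the pandas-gbq implementation.
    """
    if total_rows == 0:
        # Nothing to upload, so treat the load as already complete
        # instead of dividing by zero.
        return 100.0
    return (total_rows - remaining_rows) * 100 / total_rows


print(percent_complete(0, 0))     # 100.0
print(percent_complete(200, 50))  # 75.0
```

Without the `total_rows == 0` check, the first call would raise ZeroDivisionError.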

@bwanglzu bwanglzu changed the title resolve divide by 0 error when uploading empty dataframe BUG: resolve divide by 0 error when uploading empty dataframe Feb 22, 2019
@max-sixty
Contributor

Thanks a lot @bwanglzu

Would you be able to add a test? It should be very short

@bwanglzu
Contributor Author

@max-sixty definitely.

@@ -246,6 +246,28 @@ def test_to_gbq_doesnt_run_query(
    mock_bigquery_client.query.assert_not_called()


def test_to_gbq_uploading_empty_dataframe(
Contributor

It's very reasonable to copy code that also tests writes, but this one is doing something slightly more involved regarding pandas versioning.

Could you copy the code here: https://github.com/pydata/pandas-gbq/blob/272aa7bebef5dc99869e7c44adf8011258c8d7c9/tests/system/test_gbq.py#L879 and put the test there?

Contributor Author

Hi @max-sixty thanks for your feedback. Two questions:

  1. Shall I create a test method in pandas-gbq/tests/system/test_gbq.py named test_upload_empty_data, or just create an empty pd.DataFrame within test_upload_data? Which one is better?
  2. Do I need to delete test_to_gbq_uploading_empty_dataframe in pandas-gbq/tests/unit/test_gbq.py?

Thanks!

Contributor

@max-sixty max-sixty Feb 22, 2019

Shall I create a test method in pandas-gbq/tests/system/test_gbq.py named test_upload_empty_data

This would be perfect!
That's the only test that's needed (no need to keep the existing test_to_gbq_uploading_empty_dataframe)

@max-sixty
Contributor

Great!
Do you want to add a changelog note? You're welcome to add attribution to yourself too

@bwanglzu
Contributor Author

bwanglzu commented Feb 23, 2019

Hi @max-sixty thanks for your guidance! If needed I'll add the changelog note.

Pandas-gbq really helped me a lot, thanks for your great work!

Best

@bwanglzu
Contributor Author

Hi @max-sixty, I found docs/source/changelog.rst, but I'm not sure where to add the changelog note. Can I put it under 0.9.0 / 2019-01-11?

@max-sixty
Contributor

Hi @bwanglzu - under 0.10.0 (0.9.0 was already released), under a new section "Bug fixes". Thanks!

@max-sixty
Contributor

@bwanglzu I think you need to merge master into your branch - the current changes to changelog.rst are on an older version. This is the current master: https://github.com/pydata/pandas-gbq/blob/master/docs/source/changelog.rst

@bwanglzu
Contributor Author

@max-sixty Oh sorry, I didn't see your message. I resolved the conflict directly in the GitHub UI.

@@ -1,6 +1,14 @@
Changelog
=========

.. _changelog-0.11.0:

0.11.0 / 2019-02-25
Contributor

This should still be in 0.10.0!

@max-sixty max-sixty merged commit 8a26345 into googleapis:master Feb 26, 2019
@max-sixty
Contributor

Thanks a lot @bwanglzu ! Happy to have you as a contributor

@bwanglzu
Contributor Author

@max-sixty Thanks for your help!

@bwanglzu bwanglzu deleted the bug/divide-by-zero branch February 26, 2019 15:09
def test_upload_empty_data(self, project_id):
    test_id = "data_with_0_rows"
    test_size = 0
    df = DataFrame()
Collaborator

Looks like we might have an additional problem when the DataFrame contains no columns.

In the conda build (https://circleci.com/gh/tswast/pandas-gbq/276) I'm getting:

E           google.api_core.exceptions.BadRequest: 400 POST https://www.googleapis.com/upload/bigquery/v2/projects/pandas-gbq-tests/jobs?uploadType=resumable: Empty schema specified for the load job. Please specify a schema that describes the data being loaded.

Since we still create a table in pandas-gbq before running the load job, we can probably avoid doing the load job altogether when a DataFrame contains no rows.
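
A rough sketch of that idea, under stated assumptions: `create_table` and `run_load_job` are hypothetical callables standing in for the real BigQuery calls, and `rows` is a plain list rather than a DataFrame so the example stays self-contained. It only illustrates the control flow, not the pandas-gbq internals.

```python
def upload(rows, create_table, run_load_job):
    """Create the table, then start a load job only if there are rows.

    `create_table` and `run_load_job` are hypothetical stand-ins for the
    real BigQuery calls. Returns True if a load job was started.
    """
    create_table()
    if len(rows) == 0:
        # No rows: the table already exists, so skip the load job.
        return False
    run_load_job(rows)
    return True


calls = []
upload([], lambda: calls.append("create"), lambda rows: calls.append("load"))
upload([{"a": 1}], lambda: calls.append("create"), lambda rows: calls.append("load"))
print(calls)  # ['create', 'create', 'load']
```

Whether table creation itself succeeds for a DataFrame with no columns is a separate question that this sketch does not address.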
